Image

Finding the Best Pokemon Team Composition

An Inspirational CMSC320 Project by Devan Tamot

Introduction

Hello there! It's so very nice to meet you! Welcome to the world of Pokemon! My name is Devan Tamot, but everyone just calls me the Pokemon Professor. This world is widely inhabited by creatures known as Pokemon. We humans live alongside Pokemon as friends. At times we play together, and at other times we work together. Some people use their Pokemon to battle and develop closer bonds with them. What do I do? I conduct data science research so we can learn more about Pokemon and how to best utilize their strengths and weaknesses.

As a well-established Pokemon Professor who happens to be a Computer Science genius, I wanted to make this project about finding the optimal team given various scenarios, such as a set of Pokemon to battle, a set of elements, etc. When first playing a Pokemon game, it can be quite overwhelming because so much information is being thrown at you. My hope is that this notebook will help anyone who is looking for the optimal team so they can become a Pokemon Master!

This project does not consider moves that act as buffs for Pokemon (that is, moves that modify stats, such as an attack boost or a defense boost).

Getting Started with the Data

I used Python 3 and the following libraries: requests, io, pandas, pprint, graphviz, numpy, matplotlib, seaborn, scipy, ipywidgets, and psutil.

In [1]:
# These are the necessary initialization imports
import requests, io
import pprint, random
import time, psutil
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt # pyplot is the submodule conventionally aliased as plt
import seaborn as sns
from graphviz import Digraph
# Must run the following commands in docker: 
# 1. pip install graphviz 
# 2. conda install python-graphviz

# These aren't directly important, but they are just for looks
from ipywidgets import IntProgress
import ipywidgets as widgets
from IPython.display import Image, display

The data set I will be using is a RESTful API called the PokeApi which contains JSON data about all things Pokemon.

In [2]:
j = requests.get('https://pokeapi.co/api/v2/pokemon/').json()
# pprint.pprint(j) # To see the json data

Elemental Dependency Digraph

In order to effectively calculate the most effective team, we need some sort of data structure to maintain the relationships between the elemental types. An edge connection from one node to another means that the starting elemental type is super effective against the ending elemental node. One way to represent this is as a Weighted Digraph as such: Image

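Lucky for me, PokeApi has information regarding the elemental types, so creating the weighted digraph is no issue. If no mapping exists from one element to another, the matchup is neutral, which corresponds to a damage factor of 1 in the games. Note that the getFactor helper below returns 0 for such unmapped pairs, so neutral matchups add nothing to the effectiveness sums computed later.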

In [3]:
wed = dict() # Weighted Elemental Digraph: wed[attacker][defender] = damage factor

# Adds a relationship to wed
# Parameters: from, to, weight
def addToWed(f,t,w):
    if f not in wed:
        wed[f] = dict()
    wed[f][t] = w

# Returns the damage factor from element f to element t
# (0 when no relationship is recorded for the pair)
def getFactor(f,t):
    return wed.get(f, dict()).get(t, 0)

# Returns a list of elements that f is super effective against
def getTimesTwoElements(f):
    return [e for e in wed.get(f, dict()) if wed[f][e] == 2]

j = requests.get('https://pokeapi.co/api/v2/type/').json()
    
for t in j['results'][:-2]:
    type_info = requests.get('https://pokeapi.co/api/v2/type/'+t['name']).json()
    for k in type_info['damage_relations']['double_damage_from']:
        addToWed(k['name'],t['name'],2)
    for k in type_info['damage_relations']['double_damage_to']:
        addToWed(t['name'],k['name'],2)
    for k in type_info['damage_relations']['half_damage_from']:
        addToWed(k['name'],t['name'],0.5)
    for k in type_info['damage_relations']['half_damage_to']:
        addToWed(t['name'],k['name'],0.5)
    for k in type_info['damage_relations']['no_damage_from']:
        addToWed(k['name'],t['name'],0)
    for k in type_info['damage_relations']['no_damage_to']:
        addToWed(t['name'],k['name'],0)

To check that the digraph was built correctly, here is a visualization of it.

In [4]:
dom_graph = Digraph(name='dom-graph',strict=True,graph_attr={'fontsize':'30pt'})
dom_graph.attr(label='Elemental Counter Graph',layout="circo",overlap='false')

for k,v in wed.items():
    dom_graph.node(k,k.capitalize())
    for e in getTimesTwoElements(k):
        dom_graph.node(e,e.capitalize())
        dom_graph.edge(k,e)
       
dom_graph.render('graph-viz/elemental_graph.gv', view=True,format='png')
display(Image(filename='graph-viz/elemental_graph.gv.png',width='700pt',height='700pt'))

Precomputation of Data

To avoid repeatedly looking up Pokemon data whenever we need it, I decided to precompute the data by populating a Python dictionary with the necessary information.

Attack Move Data

The first step in finding the best Pokemon team is to gather information about each move. This includes its name, elemental type, and power.

In [5]:
# Precomputation of the moves from the API is necessary to avoid having to make
# multiple calls
moveDict = dict() # Dictionary of moves [name|->(element, damage)]
j = requests.get('https://pokeapi.co/api/v2/move').json()

print('Loading all Move Data ... ')
max_count = j['count']
f = IntProgress(min=0, max=max_count) # instantiate the bar
display(f) # display the bar

while True:
    for m in j['results']:
        f.value += 1 # signal to increment the progress bar
        mInfo = requests.get(m['url']).json()
        if mInfo['power'] is not None: # skip moves without a power value
            moveDict[m['name']] = (mInfo['type']['name'],mInfo['power'])
    if j['next'] is None: # the last page has no next link
        break
    j = requests.get(j['next']).json()
print('Loaded',max_count,'Moves!')
Loading all Move Data ... 
Loaded 746 Moves!
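
One detail worth checking with paginated listings like this one is that the final page, whose next link is empty, still gets processed. The sketch below shows one way to walk such a listing. It assumes only the 'results' and 'next' keys used above; iter_results, fetch, and the fake PAGES data are hypothetical names made up for illustration, and the fetch function is injected so the example runs without network access.

```python
def iter_results(first_page, fetch):
    """Yield every entry of a paginated listing, including the last page."""
    page = first_page
    while True:
        for entry in page["results"]:
            yield entry
        if page["next"] is None:  # the last page has no next link
            break
        page = fetch(page["next"])

# Fake three-page listing standing in for the real API (hypothetical data)
PAGES = {
    "page2": {"results": ["c", "d"], "next": "page3"},
    "page3": {"results": ["e"], "next": None},
}
first = {"results": ["a", "b"], "next": "page2"}

all_entries = list(iter_results(first, PAGES.__getitem__))
print(all_entries)  # ['a', 'b', 'c', 'd', 'e']
```

With the real API, fetch could be something like lambda url: requests.get(url).json().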

Pokemon Data

Legendary Pokemon

These Pokemon are considered legendary according to the Pokemon lore. For most Pokemon battle competitions, the use of legendary Pokemon is banned. Therefore, I offer the option to filter out legendary Pokemon when finding an optimal team.

Pokemon Data Structure

In order to maintain all pokemon information in a useful neat way, I created a Pokemon object type that maintains a Pokemon's name, id, elemental types, and moves.

Creating the Complete Pokemon Roster

As before with the move dictionary, I front-load all the GET requests ahead of time to prevent a clutter of GET calls later on. It takes quite a bit of time to load all 800+ Pokemon.
In [6]:
# Maintains a list of all legendary Pokemon IDs
legendaryIDList = [144, 145, 146, 150, 151, 243, 244, 245, 249, 250, 
                   251, 377, 378, 379, 380, 381, 382, 383, 384, 385, 
                   386, 480, 481, 482, 483, 484, 485, 486, 487, 488, 
                   489, 490, 491, 492, 493, 494, 638, 639, 640, 641, 
                   642, 643, 644, 645, 646, 647, 648, 649]
elAllList = ['normal','fighting','flying','poison','ground','rock','bug','ghost','steel',
          'fire','water','grass','electric','psychic','ice','dragon','dark','fairy']
class Pokemon:
    def __init__(self,name,idn,elements,moves,atk,spd,dfn):
        self.name = name
        self.idn = idn
        self.elements = elements
        self.moves = moves # dict mapping [moveName|->(element,damage)]
        self.atk = atk
        self.spd = spd
        self.dfn = dfn
        # total effectiveness of every move against every element
        total = 0
        for m,s in moves.items():
            for e in elAllList:
                total += getFactor(s[0],e)
        self.sum = total
        
    # total effectiveness of this Pokemon's moves against a single element e
    def getElementalSum(self,e):
        total = 0
        for m,s in self.moves.items():
            total += getFactor(s[0],e)
        return total

    def getType(self):
        return '/'.join(self.elements)
# Now we begin creating the ultimate pokemon roster
pkmRoster = dict()

j = requests.get('https://pokeapi.co/api/v2/pokemon').json()

print('Loading all Pokemon Data ...')

max_count = 807
f = IntProgress(min=0, max=max_count) # instantiate the bar
display(f) # display the bar

while True:
    for p in j['results']:
        f.value += 1 # signal to increment the progress bar
        pinfo = requests.get('https://pokeapi.co/api/v2/pokemon/'+p['name']).json()
        if pinfo['id'] < 808:
            types = []
            for t in pinfo['types']:
                types.append(t['type']['name'])
            moves = dict()
            for m in pinfo['moves']:
                if m['move']['name'] in moveDict:
                    moves[m['move']['name']] = moveDict[m['move']['name']]
            # look the base stats up by name instead of relying on list order
            stats = {s['stat']['name']: s['base_stat'] for s in pinfo['stats']}
            pokeObj = Pokemon(p['name'],pinfo['id'],types,moves,
                              stats['attack'],stats['speed'],stats['defense'])
            pkmRoster[p['name']] = pokeObj
    if j['next'] is None: # the last page has no next link
        break
    j = requests.get(j['next']).json()

print('Loaded',max_count,"Pokemon!")
Loading all Pokemon Data ...
Loaded 807 Pokemon!

Exploratory Data Analysis

Elemental Type Background

Pokemon come in various types, and each type has its strengths and weaknesses. The image below on the left shows the different elemental types, and the arrows from each type indicate which other types it is most effective against in battle. A Pokemon can have up to two types.
Image (left): Type Weakness Chart
Image (right): Example of Different Types of Pokemon

When Pokemon are in battle, the attacking Pokemon can choose an attack from a moveset of 4 different attacks. These attacks do not necessarily have to be the same type as the Pokemon; they can be different types. When an attack is executed on a defending Pokemon, the amount of damage done depends on the elemental type of the attack and the elemental type(s) of the defending Pokemon.
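
To make the type rule concrete, here is a minimal sketch of how a multiplier can be looked up for a defender with one or two types. In the games, the factors for each of the defender's types are multiplied together. The CHART below is a hypothetical, deliberately incomplete excerpt (pairs that are not listed are treated as neutral), and type_multiplier is a name made up for this example. The rest of this notebook scores moves per element and does not use this function.

```python
# Hypothetical mini type chart: CHART[attacking_type][defending_type] -> multiplier.
# Only a handful of relationships are listed; missing pairs are treated as neutral (1.0).
CHART = {
    "water": {"fire": 2.0, "ground": 2.0, "rock": 2.0, "grass": 0.5, "water": 0.5},
    "electric": {"water": 2.0, "flying": 2.0, "ground": 0.0, "grass": 0.5},
    "grass": {"water": 2.0, "ground": 2.0, "rock": 2.0, "fire": 0.5, "flying": 0.5},
}

def type_multiplier(move_type, defender_types, chart=CHART):
    """Multiply the per-type factors for every type the defender has."""
    result = 1.0
    for defender_type in defender_types:
        result *= chart.get(move_type, {}).get(defender_type, 1.0)
    return result

print(type_multiplier("water", ["rock", "ground"]))      # 4.0 (2 x 2)
print(type_multiplier("electric", ["water", "ground"]))  # 0.0 (2 x 0)
print(type_multiplier("grass", ["water", "flying"]))     # 1.0 (2 x 0.5)
```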

With these elemental strengths known, we can actually look up the weighted effectiveness of each elemental moveset against the others. The image below shows the elemental relationships:

Image

For example, based on the graph we see that Water-type moves do double damage against Fire, Ground, and Rock type Pokemon.

Algorithms

getSortedElList()

Parameters:

elList: a list of elements that need to be countered
noLegend: determines if we want to include legendary pokemon or not

Output:

A list of tuples in the form (pokemon_name,move_sum)

Algorithm:

This function simply gets a sorted list of the pokemon with the moveset that will best counter the element list provided

           fun getSortedElList(elList, noLegend):
                Let L be a list that will hold a score for each pokemon
                for each pokemon in the pokemon roster:
                  if the pokemon satisfies the noLegend parameter:
                      sum the effectiveness of each of its moves against each element in elList
                      then append (pokemon_name,sum) to L
                  end if
                end for
                return L sorted by the sum of each pokemon
            end fun
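
The ranking idea can be tried on a toy roster before running it on the full data. The sketch below is a simplified stand-in, not the notebook's implementation: FACTORS, ROSTER, factor, and rank are hypothetical names with made-up data, and unmapped pairs contribute 0 to the score, as in getFactor.

```python
# Hypothetical toy data: names, moves, and factors are made up for illustration.
FACTORS = {
    "water": {"fire": 2, "rock": 2, "grass": 0.5},
    "grass": {"water": 2, "rock": 2, "fire": 0.5},
}
ROSTER = {
    "toy-a": {"legendary": False, "move_types": ["water", "water"]},
    "toy-b": {"legendary": False, "move_types": ["grass"]},
    "toy-c": {"legendary": True, "move_types": ["water", "grass", "grass"]},
}

def factor(attack, target):
    # unmapped pairs contribute 0 to the score
    return FACTORS.get(attack, {}).get(target, 0)

def rank(roster, targets, no_legend):
    """Return (name, score) pairs sorted by ascending score."""
    scored = []
    for name, info in roster.items():
        if no_legend and info["legendary"]:
            continue
        score = sum(factor(m, t) for m in info["move_types"] for t in targets)
        scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1])

print(rank(ROSTER, ["fire", "rock"], no_legend=True))
# [('toy-b', 2.5), ('toy-a', 8)]
print(rank(ROSTER, ["fire", "rock"], no_legend=False))
# [('toy-b', 2.5), ('toy-a', 8), ('toy-c', 9.0)]
```

Because the list is sorted in ascending order, the best candidates sit at the end, which is why a slice such as [-6:] picks the top six.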

getBestMoveSet()

Parameters:

P: a list of n pokemon that will be used to counter the elements in elList
elList: a list of elements that need to be countered
shareMoves: a boolean variable that will determine if Pokemon in the list are allowed to share moves

Output:

A dictionary that maps [Pokemon]→[list_of_move_names]

Algorithm:

This function simply maps a set of moves to each pokemon provided

            fun getBestMoveSet(P, elList, shareMoves):
                Let E be an empty dictionary which maps elemental types in elList to a list of tuples
                Let M be an empty dictionary which maps a pokemon from P to a list of moves
                for each pokemon in P:
                    for each of the pokemon's moves:
                        if it is super effective against any of the elements e in elList:
                            append (pokemon_name,move_name,move_damage) to E[e]
                        end if
                    end for
                end for
                for each key e in E:
                    sort E[e] based on attack power
                end for
                for each of the element keys in E, starting with the one with the fewest moves in its move list:
                    loop:
                        Take the top move from its move list
                        if the pokemon that has that move still has room for moves:
                            add that move to the pokemon's move list in M
                            exit the loop
                        else:
                            remove the move and keep looking
                        end if
                    end loop
                end for
                return M
            end fun
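
A toy run may help when reading the greedy step of this pseudocode. The sketch below is a simplified illustration under stated assumptions, not the notebook's function: assign_moves and the candidate data are hypothetical, the input is already grouped per element as (pokemon, move, power) tuples, and the shareMoves option is left out.

```python
def assign_moves(candidates, max_moves=4):
    """Greedy assignment. candidates maps element -> list of (pokemon, move, power)."""
    team_moves = {}
    # elements with the fewest candidate moves are served first
    for element in sorted(candidates, key=lambda e: len(candidates[e])):
        # try the strongest candidate first
        ordered = sorted(candidates[element], key=lambda c: c[2], reverse=True)
        for pokemon, move, power in ordered:
            chosen = team_moves.setdefault(pokemon, [])
            if len(chosen) < max_moves and move not in chosen:
                chosen.append(move)
                break  # this element is handled, move on to the next one
    return team_moves

# Hypothetical candidates: two Pokemon, two made-up moves
candidates = {
    "fire": [("toy-a", "splash-x", 40), ("toy-b", "wave-x", 90)],
    "rock": [("toy-b", "wave-x", 90)],
}
demo = assign_moves(candidates, max_moves=1)
print(demo)  # {'toy-b': ['wave-x'], 'toy-a': ['splash-x']}
```

Here rock is served first because it has only one candidate, so toy-b takes wave-x; with a single move slot, fire then falls back to the weaker splash-x on toy-a.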
In [7]:
legendaryIDList = [144, 145, 146, 150, 151, 243, 244, 245, 249, 250, 
                   251, 377, 378, 379, 380, 381, 382, 383, 384, 385, 
                   386, 480, 481, 482, 483, 484, 485, 486, 487, 488, 
                   489, 490, 491, 492, 493, 494, 638, 639, 640, 641, 
                   642, 643, 644, 645, 646, 647, 648, 649]
# Return the last element of a tuple
def last(n): 
    return n[-1]   
   
# Sort a list of tuples in ascending order by their last element
def sort(tuples): 
    # We pass the user-defined function last as the sort key
    return sorted(tuples, key = last) 

# This function ranks all pokemon based on how effective they are against a list of elements
def getSortedElList(elList,noLegend):
    L = [] # (pokemon_name, score) pairs, sorted by score before returning
    for p,v in pkmRoster.items():
        # skip legendary pokemon when they are not allowed
        if noLegend and v.idn in legendaryIDList:
            continue
        score = 0
        for m,s in v.moves.items():
            for e in elList:
                score += getFactor(s[0],e)
        L.append((p,score))
    return sort(L)

# Given a set of Pokemon, this computes the best move set to counter a given set of elemental types
def getBestMoveSet(P,elList,shareMoves=True):
    
    # Create an empty bucket for each element to counter
    elBuckets = dict()
    for e in elList:
        elBuckets[e] = []
    
    # Go through each Pokemon's moves and record the ones that are
    # super effective against an element in elList
    pBuckets = dict()
    for p in P:
        pBuckets[p] = []
        poke = pkmRoster[p]
        for m,v in poke.moves.items():
            for s in getTimesTwoElements(v[0]):
                if s in elList:
                    elBuckets[s].append((p,m,v[1]))
    
    # Sort each bucket by attack power (ascending, so the strongest move is last)
    for e in elBuckets:
        elBuckets[e] = sort(elBuckets[e])
    
    # Handle the elements with the fewest candidate moves first
    elPriorityList = sort([(el,len(elBuckets[el])) for el in elList])
    
    moveList = [] # moves already assigned to some Pokemon
    # Pair each element with the strongest move that can still be assigned
    for el,numMoves in elPriorityList:
        bucket = elBuckets[el]
        for idx in range(len(bucket)-1,-1,-1):
            p,m,power = bucket[idx]
            # skip if the Pokemon is full or already knows this move
            if len(pBuckets[p]) == 4 or m in pBuckets[p]:
                continue
            # skip moves already used by a teammate unless sharing is allowed
            if m in moveList and not shareMoves:
                continue
            pBuckets[p].append(m)
            if m not in moveList:
                moveList.append(m)
            del bucket[idx] # removes the move from the list
            break

    return pBuckets

# creates and renders a graph based on the pokemon list
def createGraph(elList,pBuck,fileName,graphLabel='Pokemon Move to Type Dominance'):
    # Creating graphical list
    moveGraph = Digraph(name='move-graph',strict=True,graph_attr={'fontsize':'30pt'})
    moveGraph.attr(label=graphLabel)
    
    # add the elemental types to counter
    for e in elList:
        moveGraph.node(e,e.capitalize())

    # Add the pokemon
    for p,m in pBuck.items():
        moveGraph.node(p,p.capitalize()+'\n('+pkmRoster[p].getType()+')')
        for subM in m:

            moveGraph.node(subM,subM.capitalize()+'\n('+moveDict[subM][0]+')')
            moveGraph.edge(p,subM,constraint='true',concentrate='false')

            # Now we add the type dominances per move
            domElements = getTimesTwoElements(moveDict[subM][0])
            for dE in domElements:
                if dE in elList:
                    moveGraph.edge(subM,dE,constraint='true',concentrate='false')
    # Render the graph
    moveGraph.render(fileName, view=True,format='png')

Using the functions from above, we can break it down into these simple steps:

  1. Sort all pokemon based on how effective their possible moves are against a set of elements
  2. Find the most powerful team moveset by prioritizing the elements that have the fewest super-effective moves available

Note that some Pokemon do not have all 4 moves filled. This is because all the elements are already covered by the team's current moveset, so it would be unnecessary for the other Pokemon to have any extra moves. This essentially means those Pokemon can have any moves in their remaining slots.

In [8]:
elList = ['normal','fighting','flying','poison','ground','rock','bug','ghost','steel',
          'fire','water','grass','electric','psychic','ice','dragon','dark','fairy']

P = getSortedElList(elList,noLegend=True)
pBuck = getBestMoveSet([i[0] for i in P[-6:]],elList,shareMoves=False)
createGraph(elList,pBuck,'graph-viz/pokeGraphNoLegend.gv',graphLabel='Optimal Unique Team/Moves no Legendaries')

PL = getSortedElList(elList,noLegend=False)
pBuckL = getBestMoveSet([i[0] for i in PL[-6:]],elList,shareMoves=False)
createGraph(elList,pBuckL,'graph-viz/pokeGraphWithLegend.gv',graphLabel='Optimal Unique Team/Moves with Legendaries')

listOfImageNames = ['graph-viz/pokeGraphNoLegend.gv.png',
                    'graph-viz/pokeGraphWithLegend.gv.png']

for imageName in listOfImageNames:
    display(Image(filename=imageName))

The graphs below show the same as above, except that Pokemon are allowed to share moves with one another.

In [9]:
P = getSortedElList(elList,noLegend=True)
pBuck = getBestMoveSet([i[0] for i in P[-6:]],elList,shareMoves=True)
createGraph(elList,pBuck,'graph-viz/pokeGraphNoLegend.gv',graphLabel='Optimal Team/Moves no Legendaries')

PL = getSortedElList(elList,noLegend=False)
pBuckL = getBestMoveSet([i[0] for i in PL[-6:]],elList,shareMoves=True)
createGraph(elList,pBuckL,'graph-viz/pokeGraphWithLegend.gv',graphLabel='Optimal Team/Moves with Legendaries')

listOfImageNames = ['graph-viz/pokeGraphNoLegend.gv.png',
                    'graph-viz/pokeGraphWithLegend.gv.png']

for imageName in listOfImageNames:
    display(Image(filename=imageName))

Just for fun, I created a custom team selector that allows you to pick any team of pokemon and find the optimal moveset for that team.

In [10]:
current_display = False
elAllList = ['normal','fighting','flying','poison','ground','rock','bug','ghost','steel',
          'fire','water','grass','electric','psychic','ice','dragon','dark','fairy']
capPokeRoster = list(map(lambda x:x.capitalize(),list(pkmRoster.keys())))

def on_team_click(s):
    global current_display
    team = []
    for i in range(6):
        if items[i].value != '' and items[i].value.lower() in pkmRoster.keys():
            team.append(items[i].value.lower())
    pBuck = getBestMoveSet(team,elAllList,shareMoves=checkBox.value)
    createGraph(elList,pBuck,'graph-viz/customTeam.gv',graphLabel='Custom Team Comp')
    
    if(current_display):
        for proc in psutil.process_iter():
            if proc.name() == "display":
                proc.kill()
          
    file = open("graph-viz/customTeam.gv.png", "rb")
    image = file.read()
    wimage = widgets.Image(
        value=image,
        format='png',
        width=1000,
        height=700,
    )
    current_display = True
    display(wimage)
    
def on_rand_click(s):
    for i in range(6):
        items[i].value = random.choice(capPokeRoster) # randint(0,len(...)) could index one past the end
    
items = [widgets.Combobox(
        value=None,
        placeholder='Choose a Pokemon',
        options=capPokeRoster,
        description='Pokemon # '+str(i+1),
        ensure_option=True,
        disabled=False
    ) for i in range(6)]

checkBox = widgets.Checkbox(
    value=False,
    description='Share Moves',
    disabled=False)

butt = widgets.Button(description="View Move Graph")
randButt = widgets.Button(description="Choose Random",color='blue')
items.append(widgets.HBox([butt,randButt,checkBox]))
butt.on_click(on_team_click)
randButt.on_click(on_rand_click)
widgets.VBox(items)

Now that we have some form of visual representation, let's dive into the win percentages of Pokemon and see what the statistics say about our team composition.

Getting back into the Data

To figure out win percentages, I use some data that I found online. Unfortunately, the Pokemon data does not align exactly with the official Pokedex, so I had to use the data set's own numbering to maintain consistency.

The battle data information was crafted from a battle algorithm that ran simulations for battles.

In [11]:
combat = pd.read_csv("combats.csv")
pokeDF = pd.read_csv("pokemon.csv")

for index, row in pokeDF.iterrows():
    if type(row['Name']) == type('') and row['Name'].lower() in pkmRoster:
        pokeDF.loc[index,'move_sum'] = pkmRoster[row['Name'].lower()].sum

pokeDF.head()
Out[11]:
   #           Name Type 1  Type 2  HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary  move_sum
0  1      Bulbasaur  Grass  Poison  45      49       49       65       65     45           1      False     197.0
1  2        Ivysaur  Grass  Poison  60      62       63       80       80     60           1      False     151.0
2  3       Venusaur  Grass  Poison  80      82       83      100      100     80           1      False     207.0
3  4  Mega Venusaur  Grass  Poison  80     100      123      122      120     80           1      False       NaN
4  5     Charmander   Fire     NaN  39      52       43       60       50     65           1      False     358.5

Now to calculate win percentages for each Pokemon.

In [12]:
# This will calculate the win percentages of each pokemon and add it to the data set
# get the number of wins for each pokemon
numberOfWins = combat.groupby('Winner').count()
# number of fights in which each pokemon appeared second or first
countAsSecond = combat.groupby('Second_pokemon').count()
countAsFirst = combat.groupby('First_pokemon').count()

numberOfWins['Total Fights'] = countAsFirst.Winner + countAsSecond.Winner
# after grouping by Winner, the First_pokemon column holds the number of wins
numberOfWins['Win Percentage'] = numberOfWins.First_pokemon/numberOfWins['Total Fights']

# merge the win percentage dataset with the original dataset
pokeDF = pd.merge(pokeDF, numberOfWins, left_on='#', right_index = True, how='left')

pokeDF.head()
Out[12]:
   #           Name Type 1  Type 2  HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary  move_sum  First_pokemon  Second_pokemon  Total Fights  Win Percentage
0  1      Bulbasaur  Grass  Poison  45      49       49       65       65     45           1      False     197.0           37.0            37.0         133.0        0.278195
1  2        Ivysaur  Grass  Poison  60      62       63       80       80     60           1      False     151.0           46.0            46.0         121.0        0.380165
2  3       Venusaur  Grass  Poison  80      82       83      100      100     80           1      False     207.0           89.0            89.0         132.0        0.674242
3  4  Mega Venusaur  Grass  Poison  80     100      123      122      120     80           1      False       NaN           70.0            70.0         125.0        0.560000
4  5     Charmander   Fire     NaN  39      52       43       60       50     65           1      False     358.5           55.0            55.0         112.0        0.491071
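
The win-percentage arithmetic can be cross-checked on a toy battle log in plain Python. This is a sketch with made-up battles, assuming the same three fields as combats.csv (first Pokemon, second Pokemon, winner); the variable names are hypothetical.

```python
from collections import Counter

# Hypothetical battles: (first_pokemon, second_pokemon, winner)
battles = [(1, 2, 1), (2, 3, 3), (1, 3, 1), (3, 1, 3)]

fights = Counter()  # total fights per Pokemon, in either position
wins = Counter()    # total wins per Pokemon
for first, second, winner in battles:
    fights[first] += 1
    fights[second] += 1
    wins[winner] += 1

# wins divided by total fights; a Pokemon that never won gets 0.0
win_pct = {pid: wins[pid] / fights[pid] for pid in fights}
print(win_pct)
```

Pokemon 2 fought twice and never won, so it ends up at 0.0 here. In a groupby on the Winner column, a Pokemon with no wins has no row at all, so after a left merge it would show NaN rather than 0.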

We can actually see the win percentages per elemental type.

In [13]:
grouped = pokeDF.groupby('Type 1').agg({"Win Percentage": "mean"}).sort_values(by = "Win Percentage")
grouped
Out[13]:
Win Percentage
Type 1
Fairy 0.329300
Rock 0.404852
Steel 0.424529
Poison 0.433262
Bug 0.439006
Ice 0.439604
Grass 0.440364
Water 0.469357
Fighting 0.475616
Ghost 0.484027
Normal 0.535578
Ground 0.541526
Psychic 0.545747
Fire 0.579215
Dark 0.629726
Electric 0.632861
Dragon 0.633587
Flying 0.765061

Data Analysis and Linear Regression Modeling

Now that we have a visual way of seeing how a team will match up against the different elemental types, let's see which other factors help a Pokemon team win a battle.

To make things interesting, let's return to our elemental digraph to see the correlation between win percentage and elemental advantage.

In [14]:
new_dict = dict.fromkeys(elAllList, 0)
for p,o in pkmRoster.items():
    new_dict[o.elements[0]] += o.sum
# DataFrame.append was removed in pandas 2.0, so collect the rows first
rows = [{'Type':e,'Win Percentage':grouped.loc[e.capitalize(),'Win Percentage'],'Sum':new_dict[e]} for e in elAllList]
typeDF = pd.DataFrame(rows, columns=['Type','Win Percentage','Sum'])
typeDF.head(len(elAllList))
Out[14]:
Type Win Percentage Sum
0 normal 0.535578 15826.0
1 fighting 0.475616 13518.5
2 flying 0.765061 18823.0
3 poison 0.433262 8650.0
4 ground 0.541526 11774.0
5 rock 0.404852 6354.0
6 bug 0.439006 2933.5
7 ghost 0.484027 4537.0
8 steel 0.424529 5110.5
9 fire 0.579215 9643.5
10 water 0.469357 17090.0
11 grass 0.440364 10125.0
12 electric 0.632861 5830.5
13 psychic 0.545747 12350.5
14 ice 0.439604 4851.0
15 dragon 0.633587 6383.0
16 dark 0.629726 6236.0
17 fairy 0.329300 7885.0
In [15]:
# typeDF.plot(x="Sum",y="Win Percentage",style='o')
sns.regplot(x="Sum", y="Win Percentage", data=typeDF, fit_reg =True)#.set_title("Elemental Move Sum vs Win Percentage")
sns.lmplot(x="Sum", y="Win Percentage", data=typeDF, hue = 'Type',fit_reg =False)#.set_title("Elemental Move Sum vs Win Percentage")
Out[15]:
<seaborn.axisgrid.FacetGrid at 0x7f716811ae48>

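The plot above shows the relation between the total elemental effectiveness (Sum) of the Pokemon of each primary type and that type's average win percentage. The fitted line slopes upward, which suggests that the higher the elemental sum is for Pokemon of a certain type, the higher the chance those Pokemon have to win, although the points are widely scattered around the line.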
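
One way to put a number on this relation is the Pearson correlation coefficient between Sum and Win Percentage. The sketch below recomputes it in plain Python from the 18 rows of the typeDF table shown above; pearson and TYPE_STATS are hypothetical names introduced for this example, and the result describes only these per-type aggregates.

```python
import math

# (Sum, Win Percentage) for each type, copied from the typeDF table above
TYPE_STATS = {
    "normal": (15826.0, 0.535578), "fighting": (13518.5, 0.475616),
    "flying": (18823.0, 0.765061), "poison": (8650.0, 0.433262),
    "ground": (11774.0, 0.541526), "rock": (6354.0, 0.404852),
    "bug": (2933.5, 0.439006), "ghost": (4537.0, 0.484027),
    "steel": (5110.5, 0.424529), "fire": (9643.5, 0.579215),
    "water": (17090.0, 0.469357), "grass": (10125.0, 0.440364),
    "electric": (5830.5, 0.632861), "psychic": (12350.5, 0.545747),
    "ice": (4851.0, 0.439604), "dragon": (6383.0, 0.633587),
    "dark": (6236.0, 0.629726), "fairy": (7885.0, 0.329300),
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

sums = [s for s, _ in TYPE_STATS.values()]
wins = [w for _, w in TYPE_STATS.values()]
r = pearson(sums, wins)
print(f"Pearson r between Sum and Win Percentage: {r:.3f}")
```

A value close to 1 would indicate a tight linear relation, while a value near 0 would mean the sum says little about the win percentage.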

Going back to the optimal team from above, let us use this new data to find the winning percentage of a given team composition. To do that, we need to consider what stats we want to use based on how each stat correlates to the Winning Percentage.

As we can see below, the Speed and Attack stats seem to make the best linear regression models. Defense is not the worst; however, its fit does not look as tight as those for Speed and Attack, so we will not use Defense in our prediction.

Since we will not be using Defense, we will also not be using the Sum to Win Percentage regression from above.

In [16]:
ax_3pairs = sns.pairplot(pokeDF, x_vars=['Speed','Attack','Defense'], y_vars='Win Percentage', height=7, aspect=0.7, kind='reg')
ax_3pairs.fig.suptitle('Win Percentage vs. Speed, Attack, and Defense', y=1.03)
Out[16]:
Text(0.5, 1.03, 'Win Percentage vs. Speed, Attack, and Defense')

Below, we compute the slope and intercept for the Speed and Attack linear regression equations.

In [17]:
from scipy import stats # import the submodule explicitly so stats.linregress is available
# dropna(axis=0) removes every row that has a missing value in any column
nanFree = pokeDF.dropna(axis=0)
speed_reg = stats.linregress(nanFree['Speed'], nanFree['Win Percentage'])
attack_reg = stats.linregress(nanFree['Attack'], nanFree['Win Percentage'])
print(speed_reg.slope,speed_reg.intercept)
print(attack_reg.slope,attack_reg.intercept)
0.008445397023413384 -0.06577521681567766
0.0037811902415860626 0.21119942359952426
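
To show how these numbers are used, the sketch below plugs a stat value into the fitted line, slope * stat + intercept, using the coefficients printed above. The helper names predict and predict_clipped are hypothetical. Since a straight line is not bounded, very low or very high stats can produce values outside the 0 to 1 range, so a clipped variant is included as one possible way to handle that.

```python
# Coefficients copied from the output printed above
SPEED_SLOPE, SPEED_INTERCEPT = 0.008445397023413384, -0.06577521681567766
ATTACK_SLOPE, ATTACK_INTERCEPT = 0.0037811902415860626, 0.21119942359952426

def predict(stat, slope, intercept):
    """Evaluate the fitted line for a single stat value."""
    return slope * stat + intercept

def predict_clipped(stat, slope, intercept):
    """Same as predict, but limited to the valid range [0, 1] (an added assumption)."""
    return min(1.0, max(0.0, predict(stat, slope, intercept)))

print(round(predict(100, SPEED_SLOPE, SPEED_INTERCEPT), 4))    # 0.7788
print(round(predict(100, ATTACK_SLOPE, ATTACK_INTERCEPT), 4))  # 0.5893
print(predict(5, SPEED_SLOPE, SPEED_INTERCEPT) < 0)            # True
print(predict_clipped(5, SPEED_SLOPE, SPEED_INTERCEPT))        # 0.0
```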

Now that we have a way of predicting winning percentages for Pokemon based on their stats, let's compute the winning percentage of a given Pokemon team. The team winning percentage is the average of the predicted winning percentages of the team's members.

In [18]:
current_display = False
elAllList = ['normal','fighting','flying','poison','ground','rock','bug','ghost','steel',
          'fire','water','grass','electric','psychic','ice','dragon','dark','fairy']
capPokeRoster = list(map(lambda x:x.capitalize(),list(pkmRoster.keys())))

def compute_win(t):
    win_sum = 0
    for p in t:
        win_sum += pkmRoster[p].spd * speed_reg.slope + speed_reg.intercept
        win_sum += pkmRoster[p].atk * attack_reg.slope + attack_reg.intercept
    win_sum/= (2*len(t))
    return win_sum*100

def on_team_click(s):
    global current_display
    team = []
    for i in range(6):
        if items[i].value != '' and items[i].value.lower() in pkmRoster.keys():
            team.append(items[i].value.lower())

    pBuck = getBestMoveSet(team,elAllList,shareMoves=checkBox.value)
    createGraph(elList,pBuck,'graph-viz/customTeam.gv',graphLabel='Custom Team Comp: '+str(round(compute_win(team),2))+'% chance of winning')
    
    if(current_display):
        for proc in psutil.process_iter():
            if proc.name() == "display":
                proc.kill()
          
    file = open("graph-viz/customTeam.gv.png", "rb")
    image = file.read()
    wimage = widgets.Image(
        value=image,
        format='png',
        width=1000,
        height=800,
    )
    current_display = True
    display(wimage)
    
def on_rand_click(s):
    for i in range(6):
        items[i].value = random.choice(capPokeRoster) # randint(0,len(...)) could index one past the end
    
items = [widgets.Combobox(
        value=None,
        placeholder='Choose a Pokemon',
        options=capPokeRoster,
        description='Pokemon # '+str(i+1),
        ensure_option=True,
        disabled=False
    ) for i in range(6)]

checkBox = widgets.Checkbox(
    value=False,
    description='Share Moves',
    disabled=False)

butt = widgets.Button(description="View Move Graph")
randButt = widgets.Button(description="Choose Random",color='blue')
items.append(widgets.HBox([butt,randButt,checkBox]))
butt.on_click(on_team_click)
randButt.on_click(on_rand_click)
widgets.VBox(items)

Now that we have a method of predicting the probability of a team composition winning, let's see how our previously determined optimal team composition does under this prediction model.

In [19]:
elList = ['normal','fighting','flying','poison','ground','rock','bug','ghost','steel',
          'fire','water','grass','electric','psychic','ice','dragon','dark','fairy']

P = getSortedElList(elList,noLegend=True)

pBuck = getBestMoveSet([i[0] for i in P[-6:]],elList,shareMoves=False)
createGraph(elList,pBuck,'graph-viz/pokeGraphNoLegend.gv',graphLabel='Optimal Unique Team/Moves No Legendaries: '+str(round(compute_win([i[0] for i in P[-6:]]),2))+"%")

PL = getSortedElList(elList,noLegend=False)

pBuckL = getBestMoveSet([i[0] for i in PL[-6:]],elList,shareMoves=False)
createGraph(elList,pBuckL,'graph-viz/pokeGraphWithLegend.gv',graphLabel='Optimal Unique Team/Moves with Legendaries: '+str(round(compute_win([i[0] for i in PL[-6:]]),2))+"%")

listOfImageNames = ['graph-viz/pokeGraphNoLegend.gv.png','graph-viz/pokeGraphWithLegend.gv.png']

for imageName in listOfImageNames:
    display(Image(filename=imageName))

Conclusion and Insights

After doing this project, I learned a lot about what it takes to design a battle system that cannot be exploited in video games. The game designers did a decent job of balancing abilities, moves, and elemental power distribution across the 800+ Pokemon. Even after exploiting the correlation of speed and attack with winning percentage, I was only able to get the predicted winning percentage of my "optimal team" to about 2/3 of battles. Of course, several other factors could go into calculating a winning percentage per Pokemon, such as tactics, special stats, and abilities, but this tutorial gave a decent insight into which Pokemon features stand out the most.

I love Pokemon, and I always have since I was a kid. I am so glad I was able to use my data science skills as a Pokemon Professor to find an optimal Pokemon team composition. If you are just a casual player, take advantage of the custom team composition calculator and see how well your team is able to counter all elemental types! This tutorial was a load of fun to make, and I hope it is of some use to all Pokemon Trainers!